@devin-ai-integration devin-ai-integration bot commented Jul 18, 2025

feat(executors): Add Java connector support with automatic JRE management

Summary

This PR adds Java connector support to PyAirbyte through a new JavaExecutor class that automatically downloads and manages a Zulu JRE installation. The implementation ports the working logic from the Airbyte shell script to Python and plugs into PyAirbyte's existing executor framework.

Key Features:

  • Automatic JRE download and caching in ~/.airbyte/java/{os}-{arch}/ directories
  • Platform detection for Linux/macOS and x64/aarch64 architectures
  • Integration with Azul's API for Zulu 21 JRE downloads
  • Support for use_java_tar parameter with Path, str, bool, and None types
  • Fallback to system JAVA_HOME if available
  • Proper error handling and user-friendly messages
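
As a rough illustration of the platform detection and cache layout listed above, here is a minimal sketch. The function names (`detect_platform`, `jre_cache_dir`) and the exact OS/architecture labels are assumptions made for this example; they are not taken from the PR's `java.py`.

```python
from __future__ import annotations

import platform
from pathlib import Path

# Hypothetical label mappings. The PR only states that Linux/macOS and
# x64/aarch64 are supported; the exact strings used are an assumption.
_OS_NAMES = {"linux": "linux", "darwin": "macos"}
_ARCH_NAMES = {
    "x86_64": "x64",
    "amd64": "x64",
    "aarch64": "aarch64",
    "arm64": "aarch64",
}


def detect_platform(
    system: str | None = None,
    machine: str | None = None,
) -> tuple[str, str]:
    """Return (os, arch) labels, raising on unsupported platforms."""
    system = (system or platform.system()).lower()
    machine = (machine or platform.machine()).lower()
    try:
        return _OS_NAMES[system], _ARCH_NAMES[machine]
    except KeyError as ex:
        raise RuntimeError(f"Unsupported platform: {system}/{machine}") from ex


def jre_cache_dir(os_name: str, arch: str, root: Path | None = None) -> Path:
    """Build the per-platform cache directory, e.g. ~/.airbyte/java/linux-x64."""
    root = root or Path.home() / ".airbyte" / "java"
    return root / f"{os_name}-{arch}"
```

Passing `system` and `machine` explicitly makes the mapping testable without depending on the host machine.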

API Usage:

# With explicit tar path
source = ab.get_source("source-mssql", use_java_tar="/path/to/connector.tar")

# With auto-location (when implemented)
source = ab.get_source("source-mssql", use_java_tar=True)

Review & Testing Checklist for Human

This is a medium-risk change introducing new platform-specific functionality:

  • End-to-end testing: Build a Java connector (e.g., source-mssql) using gradlew and test the complete flow from tar installation to --spec execution
  • JRE download verification: Test JRE download on both Linux and macOS to ensure platform detection and Azul API integration work correctly
  • Parameter type handling: Verify all use_java_tar parameter types work as expected, especially the bool=True case for auto-location
  • Error handling validation: Test edge cases like network failures, invalid tar files, and unsupported platforms to ensure user-friendly error messages
  • Registry integration: Verify that Java connectors in the registry properly trigger JavaExecutor usage

Recommended test plan:

  1. Build source-mssql connector: cd ~/repos/airbyte && ./gradlew :airbyte-integrations:connectors:source-mssql:build
  2. Test with explicit path: ab.get_source("source-mssql", use_java_tar="path/to/airbyte-app.tar").spec()
  3. Verify JRE caching works by running twice and confirming no re-download
  4. Test on clean environment to validate full JRE download flow

Diagram

%%{ init : { "theme" : "default" }}%%
graph TD
    User["User calls<br/>ab.get_source()"]:::context
    Factory["airbyte/_executors/<br/>util.py"]:::major-edit
    JavaExec["airbyte/_executors/<br/>java.py"]:::major-edit
    Init["airbyte/_executors/<br/>__init__.py"]:::minor-edit
    
    AzulAPI["Azul JRE API"]:::context
    JRECache["~/.airbyte/java/<br/>{os}-{arch}/"]:::context
    ConnectorTar["Connector TAR file"]:::context
    
    User -->|"use_java_tar param"| Factory
    Factory -->|"creates JavaExecutor"| JavaExec
    JavaExec -->|"downloads JRE"| AzulAPI
    JavaExec -->|"caches to"| JRECache
    JavaExec -->|"extracts & runs"| ConnectorTar
    
    subgraph Legend
        L1[Major Edit]:::major-edit
        L2[Minor Edit]:::minor-edit  
        L3[Context/No Edit]:::context
    end
    
    classDef major-edit fill:#90EE90
    classDef minor-edit fill:#87CEEB
    classDef context fill:#FFFFFF

Notes

  • Platform support: Currently limited to Linux/macOS with x64/aarch64 architectures. Windows support would require additional implementation.
  • JRE version: Hardcoded to Zulu 21 based on Airbyte's current requirements. Future versions may need parameterization.
  • Auto-location: The use_java_tar=True case currently sets path to None - full auto-location logic may need additional implementation based on registry metadata.
  • Performance: JRE downloads are ~50-100MB and cached permanently. First-time setup will be slower but subsequent runs should be fast.
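
To make the auto-location note concrete, the following sketch shows one way the four accepted `use_java_tar` types could be reduced to an optional path. The helper name `java_tar_path` is hypothetical; the only behavior taken from the notes above is that `True` currently yields no path.

```python
from __future__ import annotations

from pathlib import Path


def java_tar_path(use_java_tar: Path | str | bool | None) -> Path | None:
    """Reduce the use_java_tar argument to an optional TAR path.

    bool and None carry no path information: True means "locate the
    TAR automatically" (not yet implemented per the notes), so the
    path stays None in those cases.
    """
    if use_java_tar is None or isinstance(use_java_tar, bool):
        return None
    return Path(use_java_tar)
```

The `isinstance(..., bool)` check comes before the `Path(...)` conversion because `Path(True)` would raise a `TypeError`.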


devin-ai-integration bot and others added 2 commits July 18, 2025 00:24
…ment

- Add JavaExecutor class for managing Java connectors
- Implement automatic JRE download and caching in ~/.airbyte/java
- Support Zulu 21 JRE with dynamic OS/architecture detection
- Integrate with existing executor factory pattern
- Remove Java connector warning from registry
- Add support for connector tar file execution
- Tested with source-mssql connector --spec command

Co-Authored-By: AJ Steers <[email protected]>

Original prompt from AJ Steers:

@Devin - Let's add a 'java' module to PyAirbyte. If we are asked to install or run a java connector, we'll do the following:
1. Check if we've already downloaded a JRE.
2. If not, download a JRE to ~/.airbyte/java
3. Assume we need Zulu 21, but we'll dynamically get the OS and architecture at runtime.
4. Refer to the below script for invocation logic that appears to be working as of now for our use case.

Working logic is here as a shell script. You can copy the logic for this, but migrate to Python: <https://github.com/airbytehq/airbyte/pull/63367/files>
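
Steps 1 and 2 of the prompt amount to a check-then-download cache. A minimal sketch follows, assuming the extracted JRE exposes its launcher at `bin/java` inside the cache directory and that the actual download and extraction is delegated to a caller-supplied function. Both the `ensure_jre` name and the `download` callback are hypothetical and stand in for whatever the real implementation does against Azul's API.

```python
from __future__ import annotations

from pathlib import Path
from typing import Callable


def ensure_jre(cache_dir: Path, download: Callable[[Path], None]) -> Path:
    """Return the path to the cached java binary, downloading if missing.

    `download` is a hypothetical callback that fetches and extracts a
    JRE into `cache_dir`. The bin/java layout is an assumption.
    """
    java_bin = cache_dir / "bin" / "java"
    if not java_bin.exists():
        # Nothing cached yet: create the directory and populate it.
        cache_dir.mkdir(parents=True, exist_ok=True)
        download(cache_dir)
    if not java_bin.exists():
        raise FileNotFoundError(f"JRE download did not produce {java_bin}")
    return java_bin
```

Injecting the download step keeps the caching logic testable offline; a second call with the same `cache_dir` should skip the download entirely.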


🤖 Devin AI Engineer

I'll be helping with this pull request! Here's what you should know:

✅ I will automatically:

  • Address comments on this PR. Add '(aside)' to your comment to have me ignore it.
  • Look at CI failures and help fix them

Note: I can only respond to comments from users who have write access to this repository.


devin-ai-integration bot and others added 4 commits July 18, 2025 00:40
- Update get_connector_executor to accept use_java_tar parameter
- Support Path, str, bool, and None types for use_java_tar
- Implement logic for bool True/False similar to docker_image=True
- Address GitHub PR feedback from @aaronsteers

Co-Authored-By: AJ Steers <[email protected]>
- Rename connector_tar_path to use_java_tar in JavaExecutor.__init__
- Update parameter documentation
- Prepare for bool value handling logic

Co-Authored-By: AJ Steers <[email protected]>
- Support Path, str, bool, and None types for use_java_tar
- Add logic to handle True/False values for auto-location
- Update type hints and documentation
- Complete implementation of GitHub PR feedback

Co-Authored-By: AJ Steers <[email protected]>
- Update JavaExecutor call to use use_java_tar parameter
- Fix mypy type checking error
- Complete implementation of GitHub PR feedback

Co-Authored-By: AJ Steers <[email protected]>
- Remove requirement for use_java_tar when no install method specified
- Java connectors now fall back to Docker like other connector types
- Address GitHub PR feedback from @aaronsteers

Co-Authored-By: AJ Steers <[email protected]>
        pip_url = metadata.pypi_package_name
        pip_url = f"{pip_url}=={version}" if version else pip_url
    case InstallType.JAVA:
        docker_image = True

Let's make this a bit more robust...

When default install type is Java:

  1. If use_java_tar is False: fallback to Docker.
  2. If use_java_tar is None and Docker is available: use Docker.
  3. Else (use_java_tar is Truthy): use Java.


I forgot to mention the case where use_java_tar is None and Docker is available (2b). In that case, we can default to Docker (for now) as the more stable and proven option when Docker is available.

Document these cases for the user within the docstring of the public function(s)/method(s), and within the file docstring for the java module.
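
The cases requested in this review could be sketched roughly as follows. This is an illustration only: `choose_executor` is a hypothetical name, Docker availability is passed in as a plain boolean instead of being probed, and the sketch treats "None with Docker unavailable" as falling through to Java, which is one reading of the "else" branch.

```python
from __future__ import annotations

from pathlib import Path


def choose_executor(
    use_java_tar: Path | str | bool | None,
    docker_available: bool,
) -> str:
    """Pick "docker" or "java" for a connector whose default install type is Java."""
    if use_java_tar is False:
        # Explicit opt-out: fall back to Docker.
        return "docker"
    if use_java_tar is None and docker_available:
        # No preference stated: prefer Docker as the more proven option.
        return "docker"
    # Truthy value (True, str, or Path), or None without Docker: use Java.
    return "java"
```

Using identity checks (`is False`, `is None`) keeps the explicit opt-out distinct from the unset default, which a plain truthiness test would conflate.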

… documentation

- Add comprehensive fallback logic for Java connectors in get_connector_executor
- When use_java_tar=False: explicitly use Docker
- When use_java_tar=None and Docker available: prefer Docker for stability
- When use_java_tar=None and Docker unavailable: fallback to Java with auto-detection
- When use_java_tar is truthy: use Java executor
- Add detailed docstring documentation explaining fallback behavior
- Add module-level docstring to java.py explaining fallback logic
- Import is_docker_installed at module level for clean code structure

Addresses GitHub PR comment requesting more robust fallback logic and documentation.

Co-Authored-By: AJ Steers <[email protected]>

github-actions bot commented Jul 18, 2025

PyTest Results (Fast Tests Only, No Creds)

729 tests  +1   729 ✅ +1   18m 16s ⏱️ +24s
  1 suites ±0     0 💤 ±0 
  1 files   ±0     0 ❌ ±0 

Results for commit f1d3730. ± Comparison against base commit a31830c.

♻️ This comment has been updated with latest results.

PyTest Results (Full)

791 tests  +1   777 ✅ +1   30m 34s ⏱️ -5s
  1 suites ±0    14 💤 ±0 
  1 files   ±0     0 ❌ ±0 

Results for commit f1d3730. ± Comparison against base commit a31830c.

@aaronsteers aaronsteers marked this pull request as draft July 18, 2025 17:08

Closing due to inactivity for more than 7 days.

devin-ai-integration bot added a commit that referenced this pull request Aug 11, 2025
- Add JavaExecutor with automatic JRE management and TAR extraction
- Add use_java parameter (None/True/False/Path) for Java execution control
- Add use_java_tar parameter (None/Path) for connector TAR file location
- Implement fallback logic: use_java_tar implies use_java=True when set
- Add comprehensive documentation and error handling
- Create source-snowflake example demonstrating Java connector usage
- Copy implementation from PR #719 with updated dual-parameter API

Requested by: @aaronsteers

Co-Authored-By: AJ Steers <[email protected]>